This paper establishes the global asymptotic equivalence between a Poisson process with variable intensity and white noise with drift under sharp smoothness conditions on the unknown function. This equivalence is also extended to density estimation models by Poissonization. The asymptotic equivalences are established by constructing explicit equivalence mappings. The impact of such asymptotic equivalence results is that an investigation in one of these nonparametric models automatically yields asymptotically analogous results in the other models.


1. Introduction. The purpose of this paper is to give an explicit construction of global asymptotic equivalence in the sense of Le Cam (1964) between a Poisson process with variable intensity and white noise with drift. The construction is extended to density estimation models. It yields asymptotic solutions to both density estimation and Poisson process problems based on asymptotic solutions to white noise with drift problems and vice versa.


Density estimation model. A random vector $V_n^*$ of length $n$ is observed such that $V_n^* \equiv (V_1^*, \ldots, V_n^*)$ is a sequence of i.i.d. variables with a common density $f \in \mathcal{F}$.

Poisson process. A random vector of random length $\{N, X_N\}$ is observed such that $N \equiv N_n$ is a Poisson variable with $EN = n$ and that given $N = m$, $X_N = X_m \equiv (X_1, \ldots, X_m)$ is a sequence of i.i.d. variables with a common density $f \in \mathcal{F}$. The resulting observations are then distributed as a Poisson process with intensity function $nf$.
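As a rough illustration of this sampling scheme, the following sketch draws one Poissonized sample. The helper names (`sample_poisson`, `draw_linear`, `poissonized_sample`) and the test density $f(x) = 1/2 + x$ on [0,1] are assumptions made for the example and do not come from the paper.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw a Poisson(lam) variate by Knuth's multiplication method.

    Adequate for moderate lam; exp(-lam) underflows for very large lam.
    """
    threshold = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def draw_linear(rng):
    """Inverse-CDF draw from the assumed density f(x) = 1/2 + x on [0, 1]."""
    u = rng.random()
    return (-1.0 + math.sqrt(1.0 + 8.0 * u)) / 2.0


def poissonized_sample(n, draw, rng):
    """Return (N, [X_1, ..., X_N]) with N ~ Poisson(n), X_i i.i.d. from `draw`."""
    size = sample_poisson(n, rng)
    return size, [draw(rng) for _ in range(size)]


rng = random.Random(0)
N, xs = poissonized_sample(50, draw_linear, rng)
```

The sample size `N` is itself random with mean `n`, which is what makes the bin counts over disjoint intervals independent Poisson variables.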

White noise. A Gaussian process $Z^* \equiv Z_n^* \equiv \{Z_n^*(t), 0 \le t \le 1\}$ is observed such that

$$Z_n^*(t) \equiv \int_0^t \sqrt{f(x)}\,dx + \frac{B^*(t)}{2\sqrt{n}}, \qquad 0 \le t \le 1, \tag{1.1}$$

with a standard Brownian motion $B^*(t)$ and an unknown probability density function $f \in \mathcal{F}$ in [0,1].

Asymptotic equivalence. For any two experiments $\mathcal{E}_1$ and $\mathcal{E}_2$ with a common parameter space $\Theta$, $\Delta(\mathcal{E}_1, \mathcal{E}_2; \Theta)$ denotes Le Cam's distance [cf., e.g., Le Cam (1986) or Le Cam and Yang (1990)] defined as

$$\Delta(\mathcal{E}_1, \mathcal{E}_2; \Theta) \equiv \sup_{L} \max_{j=1,2} \sup_{\delta^{(j)}} \inf_{\delta^{(k)}} \sup_{\theta \in \Theta} \bigl| E_\theta^{(j)} L(\theta, \delta^{(j)}) - E_\theta^{(k)} L(\theta, \delta^{(k)}) \bigr|,$$

where (a) the first supremum is taken over all decision problems with loss function $\|L\|_\infty \le 1$, (b) given the decision problem and $j = 1,2$, $k = 3 - j$ ($k = 2$ for $j = 1$ and $k = 1$ for $j = 2$) the "maximin" value of the maximum difference in risks over $\Theta$ is computed over all (randomized) statistical procedures $\delta^{(\ell)}$ for $\mathcal{E}_\ell$ and (c) the expectations $E_\theta^{(\ell)}$ are evaluated in experiments $\mathcal{E}_\ell$ with parameter $\theta$, $\ell = j, k$. The statistical interpretation of the Le Cam distance is as follows: If $\Delta(\mathcal{E}_1, \mathcal{E}_2; \Theta) \le \varepsilon$, then for any decision problem with $\|L\|_\infty \le 1$ and any statistical procedure $\delta^{(j)}$ with the experiment $\mathcal{E}_j$, $j = 1,2$, there exists a (randomized) procedure $\delta^{(k)}$ with $\mathcal{E}_k$, $k = 3 - j$, such that the risk of $\delta^{(k)}$ evaluated in $\mathcal{E}_k$ nearly matches (within $\varepsilon$) that of $\delta^{(j)}$ evaluated in $\mathcal{E}_j$.

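Two sequences of experiments $\{\mathcal{E}_{1,n}, n \ge 1\}$ and $\{\mathcal{E}_{2,n}, n \ge 1\}$, with a common parameter space $\mathcal{F}$, are asymptotically equivalent if

$$\Delta(\mathcal{E}_{1,n}, \mathcal{E}_{2,n}; \mathcal{F}) \to 0 \quad \text{as } n \to \infty.$$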

The interpretation is that the risks of corresponding procedures converge. 

A key result of Le Cam (1964) is that this equivalence of experiments can be characterized using random transformations between the probability spaces. A random transformation $T(X,U)$, which maps observations $X$ into the space of observations $Y$ (with possible dependence on an independent, uninformative random component $U$), also maps distributions in $\mathcal{E}_1$ to approximations of the distributions in $\mathcal{E}_2$ via $P_\theta^{(1)} T \approx P_\theta^{(2)}$. For the mapping between the Poisson and Gaussian processes we shall restrict ourselves to transformations $T$ with deterministic inverses, $T^{-1}(T(X,U)) = X$. The experiments are asymptotically equivalent if the total-variation distance between $P_\theta^{(2)}$ and the distribution of $T$ under $P_\theta^{(1)}$ converges to 0 uniformly in $\theta$. As explained in Brown and Low (1996) and Brown, Cai, Low and Zhang (2002), knowing an appropriate $T$ allows explicit construction of estimation procedures in $\mathcal{E}_1$ by applying statistical procedures from $\mathcal{E}_2$ to $T(X,U)$.

In general, asymptotic equivalence also implies a transformation from the $P_\theta^{(2)}$ to the $P_\theta^{(1)}$ and the corresponding total-variation distance bound. However, in the case of the equivalence between the Poisson process and white noise with drift, by requiring that the transformation be invertible, we have saved ourselves a step. The transformation in the other direction is $T^{-1}$, and

$$\|P_\theta^{(1)} T - P_\theta^{(2)}\| \ge \|P_\theta^{(1)} T T^{-1} - P_\theta^{(2)} T^{-1}\| = \|P_\theta^{(1)} - P_\theta^{(2)} T^{-1}\|.$$

Therefore, it is sufficient if $\sup_\theta \|P_\theta^{(1)} T - P_\theta^{(2)}\| \to 0$.

The equivalence mappings $T_n$ constructed in this paper from the sample space of the Poisson process to the sample space of the white noise are invertible randomized mappings such that

$$\sup_{f \in \mathcal{F}} H_f(T_n(N, X_N), Z_n^*) \to 0 \tag{1.2}$$

under certain conditions on the family $\mathcal{F}$. Here $H_f(Z_1, Z_2)$ denotes the Hellinger distance of stochastic processes or random vectors $Z_1$ and $Z_2$ living in the same sample space, when the true unknown density is $f$. Since $T_n$ are invertible randomized mappings, $T_n(N, X_N)$ are sufficient statistics for the Poisson processes and their inverses $T_n^{-1}$ are necessarily many-to-one deterministic mappings. Similar considerations apply for the mapping of the density estimation problem to the white noise with drift problem, although in that case there are two mappings, one from the density estimation to the white noise with drift model and another from the white noise with drift model back to the density estimation model. These mappings are given in Section 2.

There have recently been several papers on the global asymptotic equivalence of nonparametric experiments. Brown and Low (1996) established global asymptotic equivalence of the white noise problem with unknown drift $f$ to a nonparametric regression problem with deterministic design and unknown regression $f$ when $f$ belongs to a Lipschitz class with smoothness index $\alpha > 1/2$. It has also been demonstrated that such nonparametric problems are typically asymptotically nonequivalent when the unknown $f$ belongs to larger classes, for example, with smoothness index $\alpha < 1/2$. Brown and Low (1996) showed the asymptotic nonequivalence between the white noise problem and nonparametric regression with deterministic design for $\alpha < 1/2$, and Efromovich and Samarov (1996) showed that the asymptotic equivalence may fail when $\alpha < 1/4$. Brown and Zhang (1998) showed the asymptotic nonequivalence for $\alpha \le 1/2$ between any pair of the following four experiments: white noise, density problem, nonparametric regression with random design, and nonparametric regression with deterministic design. In Brown, Cai, Low and Zhang (2002) the asymptotic equivalence for nonparametric regression with random design was shown under Besov constraints which include Lipschitz classes with any smoothness index $\alpha > 1/2$. Grama and Nussbaum (1998) solved the fixed-design nonparametric regression problem for nonnormal errors. Milstein and Nussbaum (1998) showed that some diffusion problems can be approximated by discrete versions that are nonparametric autoregression models, and Golubev and Nussbaum (1998) established a discrete Gaussian approximation to the problem of estimating the spectral density of a stationary process.

Most closely related to this paper is the work in Nussbaum (1996) where global asymptotic equivalence of the white noise problem to the nonparametric density problem with unknown density $g = f^2/4$ is shown. In that paper the global asymptotic equivalence was established under the following smoothness assumption: $f$ belongs to the Lipschitz classes with smoothness index $\alpha > 1/2$.


The parameter spaces. The class of functions $\mathcal{F}$ will be assumed throughout to be densities with respect to Lebesgue measure on [0,1] that are uniformly bounded away from 0. The smoothness conditions on $\mathcal{F}$ can be described in terms of Haar basis functions of the densities. Let

$$\theta_{k,\ell} \equiv \theta_{k,\ell}(f) \equiv \int f \phi_{k,\ell}, \qquad \ell = 0, \ldots, 2^k - 1,\ k = 0, 1, \ldots, \tag{1.3}$$

be the Haar coefficients of $f$, where

$$\phi_{k,\ell} \equiv 2^{k/2}\bigl(1_{I_{k+1,2\ell}} - 1_{I_{k+1,2\ell+1}}\bigr) \tag{1.4}$$

are the Haar basis functions with $I_{k,\ell} \equiv [\ell/2^k, (\ell+1)/2^k)$. The convergence of the Hellinger distance in (1.2) is established via an inequality in Theorem 3 in terms of the tails of the Besov norms $\|f\|_{1/2,2,2}$ and $\|f\|_{1/2,4,4}$ of the Haar coefficients $\theta_{k,\ell} \equiv \theta_{k,\ell}(f)$ in (1.3).

The Besov norms $\|f\|_{\alpha,p,q}$ for the Haar coefficients, with smoothness index $\alpha$ and shape parameters $p$ and $q$, are defined by

$$\|f\|_{\alpha,p,q} \equiv \left[ \left| \int_0^1 f \right|^q + \sum_{k=0}^{\infty} \left( 2^{k(\alpha + 1/2 - 1/p)} \left( \sum_{\ell=0}^{2^k-1} |\theta_{k,\ell}|^p \right)^{1/p} \right)^q \right]^{1/q}. \tag{1.5}$$
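Let $\bar f_k$ be the piecewise average of $f$ at resolution level $k$, that is, the piecewise constant function defined by

$$\bar f_k \equiv \bar f_k(t) \equiv \sum_{\ell=0}^{2^k-1} 1\{t \in I_{k,\ell}\}\, 2^k \int_{I_{k,\ell}} f. \tag{1.6}$$

Since $\|\bar f_k - \bar f_{k+1}\|_p^p = \int \bigl|\sum_\ell \theta_{k,\ell}\phi_{k,\ell}\bigr|^p = \sum_\ell |\theta_{k,\ell}|^p\, 2^{k(p/2-1)}$, (1.5) can be written as

$$\|f\|_{\alpha,p,q} = \left\{ |\bar f_0|^q + \sum_{k=0}^{\infty} \bigl( 2^{k\alpha} \|\bar f_k - \bar f_{k+1}\|_p \bigr)^q \right\}^{1/q},$$

and its tail at resolution level $k_0 \ge 0$ is $\|f - \bar f_{k_0}\|_{\alpha,p,q}$, with

$$\|f - \bar f_{k_0}\|_{\alpha,p,q}^q = \sum_{k=k_0}^{\infty} \left( 2^{k(\alpha+1/2-1/p)} \left( \sum_{\ell=0}^{2^k-1} |\theta_{k,\ell}|^p \right)^{1/p} \right)^q. \tag{1.7}$$

Let $B(\alpha,p,q)$ be the Besov space

$$B(\alpha,p,q) \equiv \{f : \|f\|_{\alpha,p,q} < \infty\}.$$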
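The identity linking level differences of the piecewise averages to the Haar coefficients can be checked numerically. The sketch below assumes, purely for illustration, a density that is piecewise constant on $2^K$ equal cells; the function names and the sample values are hypothetical.

```python
import math


def cell_integrals(values, k):
    """Integrals of f over I_{k,l}, l = 0..2^k - 1, where f is piecewise
    constant with heights `values` on 2^K equal cells (requires k <= K)."""
    K = len(values).bit_length() - 1
    width = 2 ** (K - k)
    return [sum(values[l * width:(l + 1) * width]) / 2 ** K
            for l in range(2 ** k)]


def haar_coefficients(values, k):
    """theta_{k,l} = 2^{k/2} (int_{I_{k+1,2l}} f - int_{I_{k+1,2l+1}} f)."""
    fine = cell_integrals(values, k + 1)
    return [2 ** (k / 2) * (fine[2 * l] - fine[2 * l + 1])
            for l in range(2 ** k)]


def level_difference_norm(values, k, p):
    """||f_k - f_{k+1}||_p^p computed directly from the piecewise averages."""
    avg_k = [2 ** k * v for v in cell_integrals(values, k)]
    avg_k1 = [2 ** (k + 1) * v for v in cell_integrals(values, k + 1)]
    return sum(abs(avg_k[l // 2] - avg_k1[l]) ** p
               for l in range(2 ** (k + 1))) / 2 ** (k + 1)


def level_difference_from_haar(values, k, p):
    """sum_l |theta_{k,l}|^p 2^{k(p/2 - 1)}, the right-hand side of the identity."""
    return sum(abs(t) ** p for t in haar_coefficients(values, k)) * 2 ** (k * (p / 2 - 1))


heights = [1.2, 0.8, 1.1, 0.9, 0.7, 1.3, 1.0, 1.0]  # assumed density, integrates to 1
```

Both functions should agree at every level $k < K$ and every $p$, which is what lets (1.5) be rewritten in terms of $\|\bar f_k - \bar f_{k+1}\|_p$.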

The following two theorems on the equivalence of white noise with drift, 
density estimation and Poisson estimation models are corollaries of our main 
result, Theorem 3, which bounds the squared Hellinger distance between 
particular invertible randomized mappings of the Poisson process and white 
noise with drift models. The randomized mappings are given in Section 2. 
Proofs of these theorems are given in the Appendix. 

Theorem 1. Let $Z_n^*$, $\{N, X_N\}$ and $V_n^*$ be the Gaussian process, Poisson process and density estimation experiments, respectively. Suppose that $\mathcal{H}$ is compact in both $B(1/2,2,2)$ and $B(1/2,4,4)$ and that $\mathcal{H} \subset \{f : \inf_{0<x<1} f(x) \ge \varepsilon_0\}$ for some $\varepsilon_0 > 0$. Then

$$\lim_{n\to\infty} \Delta(Z_n^*, \{N, X_N\}; \mathcal{H}) = 0 \tag{1.8}$$

and

$$\lim_{n\to\infty} \Delta(Z_n^*, V_n^*; \mathcal{H}) = 0. \tag{1.9}$$

Our construction also shows that asymptotic equivalence holds for a class $\mathcal{F}$ if $\mathcal{F}$ is bounded in the Lipschitz norm with smoothness index $\beta$ and compact in the Sobolev norm with smoothness index $\alpha \ge \beta$ such that $\alpha + \beta \ge 1$, $\alpha \ge 3/4$ or $\beta > 1/2$.
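For $0 < \beta \le 1$ the Lipschitz norm $\|f\|_\beta^{(L)}$ and Sobolev norm $\|f\|_\alpha^{(S)}$ are defined by

$$\|f\|_\beta^{(L)} \equiv \sup_{0 \le x < y \le 1} \frac{|f(x) - f(y)|}{|x - y|^\beta}, \qquad \bigl(\|f\|_\alpha^{(S)}\bigr)^2 \equiv \sum_{n=-\infty}^{\infty} |n|^{2\alpha} |c_n(f)|^2,$$

where $c_n(f) \equiv \int_0^1 f(x) e^{i n 2\pi x}\,dx$ are the Fourier coefficients of $f$.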


Theorem 2. Let $Z_n^*$, $\{N, X_N\}$ and $V_n^*$ be the Gaussian process, Poisson process and density estimation experiments, respectively, and let $\mathcal{F}$ be bounded in the Lipschitz norm with smoothness index $\beta$ and compact in the Sobolev norm with smoothness index $\alpha \ge \beta$. Suppose $\mathcal{F} \subset \{f : \inf_{0<x<1} f(x) \ge \varepsilon_0\}$ for some $\varepsilon_0 > 0$. Then if $\alpha + \beta \ge 1$, $\alpha \ge 3/4$ or $\beta > 1/2$,

$$\lim_{n\to\infty} \Delta(Z_n^*, \{N, X_N\}; \mathcal{F}) = 0$$

and

$$\lim_{n\to\infty} \Delta(Z_n^*, V_n^*; \mathcal{F}) = 0.$$


2. The equivalence mappings. This section describes in detail the mappings which provide the asymptotic equivalence claimed in this paper. The fact that these mappings yield asymptotic equivalence is a consequence of our major result, Theorem 3. The construction is broken into several stages.

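From observations of the white noise (1.1), define random vectors

$$\bar Z_k^* \equiv \{\bar Z_{k,\ell}^*, 0 \le \ell < 2^k\}, \qquad \bar Z_{k,\ell}^* \equiv 2^k \{ Z_n^*((\ell+1)/2^k) - Z_n^*(\ell/2^k) \}, \tag{2.1}$$

$$W_k^* \equiv \{W_{k,\ell}^*, 0 \le \ell < 2^k\}, \qquad W_{k,2\ell}^* \equiv -W_{k,2\ell+1}^* \equiv (\bar Z_{k,2\ell}^* - \bar Z_{k,2\ell+1}^*)/2. \tag{2.2}$$

Let $k_0 \equiv k_{0,n}$ be suitable integers with $\lim_{n\to\infty} k_{0,n} = \infty$. Following Brown, Cai, Low and Zhang (2002), we construct equivalence mappings by finding the counterparts of $\bar Z_{k_0}^*$ and $W_k^*$, $k > k_0$, with the Poisson process $(N, X_N)$, to strongly approximate the Gaussian variables.

It can be easily verified from (1.1) that $\{\bar Z_{k_0,\ell}^*, 0 \le \ell < 2^{k_0}, W_{k,2\ell}^*, 0 \le \ell < 2^{k-1}, k > k_0\}$ are uncorrelated normal random variables with

$$E\bar Z_{k,\ell}^* = h_{k,\ell} \equiv 2^k \int_{I_{k,\ell}} h, \quad h \equiv \sqrt{f}, \qquad \sqrt{\mathrm{Var}(\bar Z_{k,\ell}^*)} = \sigma_k \equiv \sqrt{2^k/(4n)}, \tag{2.3}$$

for $\ell = 0, \ldots, 2^k - 1$, and for $\ell = 0, \ldots, 2^{k-1} - 1$,

$$EW_{k,2\ell}^* = \tfrac12 (h_{k,2\ell} - h_{k,2\ell+1}) = \sqrt{2^{k-1}} \int h \phi_{k-1,\ell}, \qquad \sqrt{\mathrm{Var}(W_{k,2\ell}^*)} = \sigma_{k-1}. \tag{2.4}$$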
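Let $U \equiv \{U_{k,\ell}, k \ge k_0, \ell \ge 0\}$ be a sequence of i.i.d. uniform variables in $[-1/2, 1/2)$ independent of $(N, X_N)$. For $k = 0, 1, \ldots$ and $\ell = 0, \ldots, 2^k - 1$ define

$$N_k \equiv \{N_{k,\ell}, 0 \le \ell < 2^k\}, \qquad N_{k,\ell} \equiv \#\{X_i : X_i \in I_{k,\ell}\}. \tag{2.5}$$

We shall approximate $\bar Z_k^*$ in (2.1) in distribution by

$$\bar Z_k \equiv \{\bar Z_{k,\ell}, 0 \le \ell < 2^k\}, \qquad \bar Z_{k,\ell} \equiv 2\sigma_k\, \mathrm{sgn}(N_{k,\ell} + U_{k,\ell}) \sqrt{|N_{k,\ell} + U_{k,\ell}|}, \tag{2.6}$$

at the initial resolution level $k = k_0$. Since $N_{k,\ell}$ are Poisson variables with

$$\lambda_{k,\ell} \equiv EN_{k,\ell} = n \int_{I_{k,\ell}} f = \frac{n \bar f_{k,\ell}}{2^k}, \qquad \bar f_{k,\ell} \equiv 2^k \int_{I_{k,\ell}} f, \tag{2.7}$$

by Taylor expansion and central limit theory

$$\bar Z_{k,\ell} \approx 2\sigma_k \left( \sqrt{\lambda_{k,\ell}} + \frac{N_{k,\ell} - \lambda_{k,\ell}}{2\sqrt{\lambda_{k,\ell}}} \right) \approx N\bigl(\sqrt{\bar f_{k,\ell}}, \sigma_k^2\bigr)$$

as $\lambda_{k,\ell} \to \infty$, compared with (2.3). Note that $\sqrt{\bar f_{k,\ell}} \approx h_{k,\ell}$ under suitable smoothness conditions on $f$, in view of (2.3) and (2.7). The Poisson variables $N_{k,\ell}$ can be fully recovered from $\bar Z_{k,\ell}$, while the randomization turns $N_{k,\ell}$ into continuous variables.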
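The variance-stabilizing effect of the randomized square root can be seen from exact moments. The sketch below computes the mean and variance of $2\,\mathrm{sgn}(N+U)\sqrt{|N+U|}$ for a Poisson count $N$ and an independent uniform $U$ on $[-1/2,1/2)$ by integrating over $U$ in closed form on each unit interval; the function names and the truncation rule are assumptions of this example.

```python
import math


def poisson_pmf(j, lam):
    """P{N = j} for N ~ Poisson(lam), evaluated on the log scale."""
    return math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))


def root_transform_moments(lam):
    """Mean and variance of Z = 2 sgn(N + U) sqrt|N + U|.

    Conditional on N = j >= 1: E[Z] = (4/3)((j + 1/2)^{3/2} - (j - 1/2)^{3/2})
    and E[Z^2] = 4j.  Conditional on N = 0: E[Z] = 0 and E[Z^2] = 1.
    """
    j_max = int(lam + 40.0 * math.sqrt(lam) + 40.0)  # assumed truncation point
    mean = 0.0
    second = poisson_pmf(0, lam) * 1.0
    for j in range(1, j_max + 1):
        weight = poisson_pmf(j, lam)
        mean += weight * (4.0 / 3.0) * ((j + 0.5) ** 1.5 - (j - 0.5) ** 1.5)
        second += weight * 4.0 * j
    return mean, second - mean ** 2


mean_50, var_50 = root_transform_moments(50.0)
```

For large intensity the mean is close to $2\sqrt{\lambda}$ and the variance close to 1, which matches the normal approximation above after rescaling by $\sigma_k$.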

Approximation of $W_k^*$ for $k > k_0$ is more delicate, since the central limit theorem is not sufficiently accurate at high resolution levels. Let $F_m$ be the cumulative distribution function of the independent sum of a binomial variable $X_{m,1/2}$ with parameter $(m, 1/2)$ and a uniform variable $U$ in $[-1/2, 1/2)$,

$$F_m(x) \equiv P\{X_{m,1/2} + U \le x\}, \tag{2.8}$$

with $F_0$ being the uniform distribution in $[-1/2, 1/2)$. Let $\Phi$ be the $N(0,1)$ cumulative distribution. We shall approximate $W_k^*$ by using a quantile transformation of randomized versions of the Poisson random variables. More specifically, let

$$W_k \equiv \{W_{k,\ell}, 0 \le \ell < 2^k\}, \qquad W_{k,2\ell} \equiv \sigma_{k-1} \Phi^{-1}\bigl(F_{N_{k-1,\ell}}(N_{k,2\ell} + U_{k,2\ell})\bigr) \tag{2.9}$$

with $W_{k,2\ell} = -W_{k,2\ell+1}$, $\ell = 0, \ldots, 2^{k-1} - 1$, and the $\sigma_k$ in (2.3). Given $N_{k-1,\ell} = m$,

$$N_{k,2\ell} \sim \mathrm{Bin}(m, p_{k,2\ell}), \qquad p_{k,2\ell} \equiv \frac{\int_{I_{k,2\ell}} f}{\int_{I_{k-1,\ell}} f} = \frac{\bar f_{k,2\ell}}{\bar f_{k,2\ell} + \bar f_{k,2\ell+1}}, \tag{2.10}$$

so that $W_{k,2\ell}$ is distributed exactly according to $N(0, \sigma_{k-1}^2)$ for $p_{k,2\ell} = 1/2$, compared with (2.4). Thus, the distributions of $W_{k,2\ell}$ and $W_{k,2\ell}^*$ are close at high resolution levels as long as $f$ is sufficiently smooth, even for small $N_{k-1,\ell} = m$.
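A minimal sketch of the quantile transformation may help here. It evaluates the smoothed symmetric-binomial c.d.f. and applies the standard normal quantile function; the names `smoothed_binomial_cdf` and `coupled_normal`, the clamping constant, and the simulation settings are assumptions of this example rather than part of the construction.

```python
import math
import random
from statistics import NormalDist


def smoothed_binomial_cdf(m, x):
    """F_m(x) = P{X + U <= x}, X ~ Bin(m, 1/2), U ~ Uniform[-1/2, 1/2)."""
    if x <= -0.5:
        return 0.0
    if x >= m + 0.5:
        return 1.0
    j = math.floor(x + 0.5)  # the integer with j - 1/2 <= x < j + 1/2
    below = sum(math.comb(m, i) for i in range(j)) / 2 ** m
    at_j = math.comb(m, j) / 2 ** m
    return below + at_j * (x - j + 0.5)


def coupled_normal(m, count, u):
    """Phi^{-1}(F_m(count + u)), clamped so inv_cdf stays inside (0, 1)."""
    p = smoothed_binomial_cdf(m, count + u)
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return NormalDist().inv_cdf(p)


rng = random.Random(1)
m = 6
draws = []
for _ in range(20000):
    count = sum(rng.random() < 0.5 for _ in range(m))  # Bin(m, 1/2) draw
    draws.append(coupled_normal(m, count, rng.random() - 0.5))
sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / len(draws)
```

For a symmetric binomial input the output is standard normal even when `m` is small, which is why this transformation is preferred to a central limit approximation at fine resolution levels.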

The equivalence mappings $T_n$, with randomization through $U$, are defined by

$$T_n : \{N, X_N, U\} \to W_{[k_0,\infty)} \to Z_n \equiv \{Z_n(t) : 0 \le t \le 1\},$$

where for $k_0 < k \le \infty$, $W_{[k_0,k)} \equiv \{\bar Z_{k_0}, W_j, k_0 < j < k\}$, and $\bar Z_k$ and $W_k$ are as in (2.6) and (2.9). The inverse of $T_n$ is a deterministic many-to-one mapping defined by

$$T_n^{-1} : Z_n^* \to W_{[k_0,\infty)}^* \to (N^*, X_{N^*}^*),$$

where for $k_0 < k \le \infty$, $W_{[k_0,k)}^* \equiv \{\bar Z_{k_0}^*, W_j^*, k_0 < j < k\}$.

Remark 1. One need only carry out the above construction to $k = k_1 : 2^{k_1} > \varepsilon n$ since we shall assume that $f \in B(1/2,2,2)$ and then the observations $W_{[k_0,k)}^* \equiv \{\bar Z_{k_0}^*, W_j^*, k_0 < j < k\}$ and $W_{[k_0,k)} \equiv \{\bar Z_{k_0}, W_j, k_0 < j < k\}$ are asymptotically sufficient for the Gaussian process and Poisson process experiments. See Brown and Low (1996) for a detailed argument in the context of nonparametric regression.


Mappings for the density estimation model. The constructive asymptotic equivalence between density estimation experiments and Gaussian experiments is established by first randomizing the density estimation experiment to an approximation of the Poisson process and then applying the randomized mapping as given above. Set $\gamma_k = \sup_{f \in \mathcal{H}} \|f - \bar f_k\|_{1/2,2,2}^2$ and note that since $\mathcal{H}$ is compact in $B(1/2,2,2)$, $\gamma_k \downarrow 0$. Now let $k_0$ be the smallest integer such that $4^{k_0}/n \ge \gamma_{k_0}$ and divide the unit interval into subintervals of equal length with length equal to $2^{-k_0}$. Let $\hat f_n$ be the corresponding histogram estimate based on $V_n^*$. Now note that since functions $f \in \mathcal{H}$ are bounded below by $\varepsilon_0 > 0$ it follows that

$$\int_0^1 \bigl(\sqrt{\hat f_n} - \sqrt{f}\bigr)^2 \le \int_0^1 \bigl(\sqrt{\hat f_n} - \sqrt{f}\bigr)^2 \frac{\bigl(\sqrt{\hat f_n} + \sqrt{f}\bigr)^2}{\varepsilon_0} = \frac{1}{\varepsilon_0} \int_0^1 (\hat f_n - f)^2. \tag{2.11}$$

Now

$$E \int (\hat f_n - f)^2 = E \int (\hat f_n - \bar f_{k_0})^2 + \int (f - \bar f_{k_0})^2 \tag{2.12}$$

and simple calculations show that the histogram estimate $\hat f_n$ satisfies $E \hat f_n(x) = \bar f_{k_0}(x)$ and $\mathrm{Var}\, \hat f_n(x) \le \bar f_{k_0}(x)\, 2^{k_0}/n$. Hence,

$$n^{1/2} E \int_0^1 (\hat f_n - \bar f_{k_0})^2 \le n^{1/2}\, \frac{2^{k_0}}{n} \le 2\gamma_{k_0-1}^{1/2} \to 0. \tag{2.13}$$

Now $n^{1/2} \le 2^{k_0}/\gamma_{k_0}^{1/2}$ and hence, from (1.7),

$$n^{1/2} \int (f - \bar f_{k_0})^2 \le \frac{1}{\gamma_{k_0}^{1/2}} \|f - \bar f_{k_0}\|_{1/2,2,2}^2 \le \gamma_{k_0}^{1/2} \to 0. \tag{2.14}$$

It thus follows from (2.11) to (2.14) that

$$n^{1/2} \sup_{f \in \mathcal{H}} E \int_0^1 \bigl(\sqrt{\hat f_n} - \sqrt{f}\bigr)^2 \to 0. \tag{2.15}$$

Hence the density estimate is squared Hellinger consistent at a rate faster than square root of $n$.

Now generate $N$, a Poisson random variable with expectation $n$ and independent of $V_n^*$. If $N > n$ generate $N - n$ conditionally independent observations $V_{n+1}^*, \ldots, V_N^*$ with common density $\hat f_n$. Finally let $(N, X_N^*) = (N, V_1^*, V_2^*, \ldots, V_N^*)$ and write $R_1$ for this randomization from $V_n^*$ to $(N, X_N^*)$,

$$R_1 : V_n^* \to (N, X_N^*).$$
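The randomization just described can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the Poisson sampler is Knuth's method (suitable for moderate `n`), and a draw from the histogram density is produced by picking an observation at random and spreading it uniformly over its bin.

```python
import math
import random


def sample_poisson(lam, rng):
    """Poisson(lam) variate by Knuth's multiplication method (moderate lam)."""
    threshold = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def poissonize(sample, k0, rng):
    """Map a fixed-size sample on [0, 1) to (N, X_1..X_N), N ~ Poisson(n).

    If N <= n the first N observations are kept; otherwise N - n extra
    points are drawn from the histogram estimate with 2^k0 equal bins.
    """
    n = len(sample)
    bins = 2 ** k0
    size = sample_poisson(n, rng)
    if size <= n:
        return size, list(sample[:size])
    extra = []
    for _ in range(size - n):
        # A uniformly chosen observation selects a bin with probability
        # proportional to its count, i.e. according to the histogram.
        cell = min(int(rng.choice(sample) * bins), bins - 1)
        extra.append((cell + rng.random()) / bins)
    return size, list(sample) + extra


data_rng = random.Random(7)
observed = [data_rng.random() for _ in range(30)]
N, points = poissonize(observed, 2, random.Random(11))
```

The original observations are never altered; only the sample size is randomized, with any shortfall made up from the estimated density.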

A map from the Poisson number of independent observations back to the fixed number of observations is obtained similarly. This time let $\hat f_n$ be the histogram estimator based on $(N, X_N)$. If $N < n$ generate $n - N$ additional conditionally independent observations with common density $\hat f_n$. It is also easy to check that

$$n^{1/2} \sup_{f \in \mathcal{H}} E \int_0^1 \bigl(\sqrt{\hat f_n} - \sqrt{f}\bigr)^2 \to 0. \tag{2.16}$$

Now label these observations $V_n = (V_1, \ldots, V_n)$ and write $R_2$ for this randomization from $(N, X_N)$ to $V_n$,

$$R_2 : (N, X_N) \to V_n.$$

Remark 2. It should also be possible to map the density estimation problem directly into an approximation of the white noise with drift model. Dividing the interval into $2^{k_0}$ subintervals and conditioning on the number of observations falling in each subinterval, the conditional distribution within each subinterval is the same as for the Poisson process. Therefore, it is only necessary to have a version of Theorem 4 for a $2^{k_0}$-dimensional multinomial experiment.

Carter (2002) provides a transformation from a $2^{k_0}$-dimensional multinomial to a multivariate normal as in Theorem 4 such that the total-variation distance between the distributions is $O(k_0 2^{k_0} n^{-1/2})$. The transformation is similar to ours in that it adds uniform noise and then uses the square root as a variance-stabilizing transformation. However, the covariance structure of the multinomial complicates the issue and necessitates using a multiresolution structure similar to the one applied here to the conditional experiments. The Carter (2002) result can be used in place of Theorem 4 to get a slightly weaker bound on the error in the approximation in Theorem 3 (because of the extra $k_0$ factor) when the total number of observations is fixed. This is enough to establish Theorem 2 if the inequalities bounding $\alpha$ and $\beta$ are changed to strictly greater than. It is also enough to establish Theorem 1 if $\mathcal{H}$ is a Besov space with $\alpha > 1/2$. Carter (2000) also showed that a somewhat more complicated transformation leads to a deficiency bound on the normal approximation to the multinomials without the added $k_0$ factor.


3. Main theorem. The theorems in Section 1 on the equivalence of white noise with drift experiments and Poisson process experiments are consequences of the following theorem which uniformly bounds the Hellinger distance between the randomized mappings described in Section 2.


Theorem 3. Suppose $\inf_{0<x<1} f(x) \ge \varepsilon_0 > 0$. Let $W_{[k_0,k)}^* \equiv \{\bar Z_{k_0}^*, W_j^*, k_0 < j < k\}$ with the variables in (2.1) and (2.2), and $W_{[k_0,k)} \equiv \{\bar Z_{k_0}, W_j, k_0 < j < k\}$ with the variables in (2.6) and (2.9). Then there exist universal constants $C$, $D_1$ and $D_2$ such that for all $k_1 > k_0$,

$$H_f^2\bigl(W_{[k_0,k_1)}, W_{[k_0,k_1)}^*\bigr) \le \frac{C 4^{k_0}}{\varepsilon_0 n} + \frac{D_1}{\varepsilon_0^2} \sum_{k=k_0}^{\infty} 2^k \sum_{\ell=0}^{2^k-1} \theta_{k,\ell}^2 + \frac{D_2 n}{\varepsilon_0^3 4^{k_0}} \sum_{k=k_0}^{\infty} 2^{3k} \sum_{\ell=0}^{2^k-1} \theta_{k,\ell}^4$$

$$= \frac{C 4^{k_0}}{\varepsilon_0 n} + \frac{D_1}{\varepsilon_0^2} \|f - \bar f_{k_0}\|_{1/2,2,2}^2 + \frac{D_2 n}{\varepsilon_0^3 4^{k_0}} \|f - \bar f_{k_0}\|_{1/2,4,4}^4,$$

where $\theta_{k,\ell}$ are the Haar coefficients of $f$ as in (1.3), $\bar f_k$ is as in (1.6) and $\|\cdot\|_{1/2,p,p}$ are the Besov norms in (1.5).

Remark 3. Here the universal constant $C$ is the same as the one in Theorem 4, while $D_1 = 3D/8 + 2$ and $D_2 = D/16 + 8/3$ for the $D$ in Theorem 5.

The proof of Theorem 3 is based on the inequalities established in Sections 4 and 5 for the normal approximation of Poisson and Binomial variables. Some additional technical lemmas are given in the Appendix.

Let $X_{m,p}$ be a $\mathrm{Bin}(m,p)$ variable, $X_\lambda$ be a Poisson variable with mean $\lambda$, and $U$ be a uniform variable in $[-1/2, 1/2)$ independent of $X_{m,p}$ and $X_\lambda$. Define

$$\tilde g_{m,p}(x) \equiv \frac{d}{dx} P\{\Phi^{-1}(F_m(X_{m,p} + U)) \le x\} \tag{3.1}$$

with the $F_m$ in (2.8) and the $N(0,1)$ distribution function $\Phi$, and define

$$\tilde g_\lambda(x) \equiv \frac{d}{dx} P\{2\,\mathrm{sgn}(X_\lambda + U) \sqrt{|X_\lambda + U|} \le x\}. \tag{3.2}$$

Write $\varphi_b$ for the density of $N(b,1)$ variables.


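Proof of Theorem 3. Let $g_{[k_0,k)}^*(w_{[k_0,k)})$ and $g_{[k_0,k)}(w_{[k_0,k)})$ be the joint densities of $W_{[k_0,k)}^*$ and $W_{[k_0,k)}$, $g_k^*(w_k)$ be the joint density of $W_k^*$, and $g_k(w_k | w_{[k_0,k)})$ be the conditional joint density of $W_k$ given $W_{[k_0,k)}$. Since $W_k^*$ is independent of $W_{[k_0,k)}^*$,

$$\int \sqrt{g_{[k_0,k)}^* g_{[k_0,k)}} - \int \sqrt{g_{[k_0,k+1)}^* g_{[k_0,k+1)}} = \int \sqrt{g_{[k_0,k)}^* g_{[k_0,k)}} \left(1 - \int \sqrt{g_k^* g_k}\right),$$

so that the Hellinger distance can be written as

$$H_f^2\bigl(W_{[k_0,k_1+1)}, W_{[k_0,k_1+1)}^*\bigr) = 2\left(1 - \int \sqrt{g_{[k_0,k_0+1)}^* g_{[k_0,k_0+1)}}\right) + \sum_{k_0 < k \le k_1} 2 \int \sqrt{g_{[k_0,k)}^* g_{[k_0,k)}} \left(1 - \int \sqrt{g_k^* g_k}\right). \tag{3.3}$$

At the initial resolution level $k_0$, $N_{k_0,\ell}$ are independent Poisson variables by (2.5), so that $\bar Z_{k_0,\ell}$ are independent. This and the independence of $\bar Z_{k_0,\ell}^*$ from (2.1) imply

$$H_f^2(\bar Z_{k_0}^*, \bar Z_{k_0}) \le \sum_{\ell=0}^{2^{k_0}-1} H_f^2(\bar Z_{k_0,\ell}^*, \bar Z_{k_0,\ell}).$$

By (2.6) and (3.2) $\bar Z_{k_0,\ell}/\sigma_{k_0}$ have densities $\tilde g_{\lambda_{k_0,\ell}}$, while $\bar Z_{k_0,\ell}^*/\sigma_{k_0}$ are $N(h_{k_0,\ell}/\sigma_{k_0}, 1)$ variables by (2.3). Thus, Theorem 4 can be used to obtain

$$H_f^2(\bar Z_{k_0,\ell}^*, \bar Z_{k_0,\ell}) = H^2\bigl(\tilde g_{\lambda_{k_0,\ell}}, \varphi_{h_{k_0,\ell}/\sigma_{k_0}}\bigr) \le \frac{C}{\lambda_{k_0,\ell}} + \frac12 \left(2\sqrt{\lambda_{k_0,\ell}} - \frac{h_{k_0,\ell}}{\sigma_{k_0}}\right)^2.$$

Since $\lambda_{k,\ell} = \bar f_{k,\ell}/(4\sigma_k^2)$ by (2.7) and $\sigma_k^2 = 2^k/(4n)$ by (2.3), the above calculation yields

$$H_f^2(\bar Z_{k_0}^*, \bar Z_{k_0}) \le C \sum_{\ell=0}^{2^{k_0}-1} \frac{2^{k_0}}{n \bar f_{k_0,\ell}} + \frac{2n}{2^{k_0}} \sum_{\ell=0}^{2^{k_0}-1} \bigl(\sqrt{\bar f_{k_0,\ell}} - h_{k_0,\ell}\bigr)^2 \tag{3.4}$$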
2 
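by Lemma 1 (i) and the bound $f \ge \varepsilon_0$.

For $k > k_0$ and $0 \le \ell \le 2^{k-1} - 1$, define

$$\mu_{k,2\ell} \equiv \sqrt{m_{k,2\ell}}\,(2p_{k,2\ell} - 1), \qquad \beta_{k,2\ell} \equiv \sqrt{\lambda_{k-1,\ell}}\,(2p_{k,2\ell} - 1), \tag{3.5}$$

where $p_{k,2\ell}$ are as in (2.10), $\lambda_{k,\ell} = \bar f_{k,\ell}\, n/2^k$ are as in (2.7), and the functions $m_{k,2\ell} \equiv m_{k,2\ell}(w_{[k_0,k)})$ are defined by $N_{k-1,\ell} = m_{k,2\ell}(W_{[k_0,k)})$. At a fixed resolution level $k > k_0$, and for $\ell = 0, \ldots, 2^{k-1} - 1$, $N_{k,2\ell}$ are independent binomial variables conditionally on $W_{[k_0,k)}$, so that by (2.9) and (3.1) $W_{k,2\ell}/\sigma_{k-1}$ are independent variables with densities $\tilde g_{m_{k,2\ell}, p_{k,2\ell}}$ under the conditional density $g_k$. In addition, $W_{k,2\ell}^*$ are independent normal variables with variance $\sigma_{k-1}^2$ under $g_k^*$. Thus,

$$H^2(g_k^*, g_k) \le \sum_{\ell=0}^{2^{k-1}-1} H^2\bigl(\tilde g_{m_{k,2\ell}, p_{k,2\ell}}, \varphi_{\beta_{k,2\ell}^*}\bigr) \tag{3.6}$$

by (2.4), where $\beta_{k,2\ell}^* \equiv EW_{k,2\ell}^*/\sigma_{k-1} = \sqrt{4n} \int h \phi_{k-1,\ell}$. It follows from Theorem 5 and (3.5) that for fixed $w_{[k_0,k)}$,

$$H^2\bigl(\tilde g_{m_{k,2\ell}, p_{k,2\ell}}, \varphi_{\beta_{k,2\ell}^*}\bigr) \le D\left[ \bigl|p_{k,2\ell} - \tfrac12\bigr|^2 + m_{k,2\ell} \bigl|p_{k,2\ell} - \tfrac12\bigr|^4 \right] + \frac12 (\mu_{k,2\ell} - \beta_{k,2\ell}^*)^2. \tag{3.7}$$

Furthermore, it follows from Lemma 3 that

$$\int \sqrt{g_{[k_0,k)}^* g_{[k_0,k)}} \bigl(\sqrt{m_{k,2\ell}} - \sqrt{\lambda_{k-1,\ell}}\bigr)^2 \le \sqrt{\int g_{[k_0,k)} \bigl(\sqrt{m_{k,2\ell}} - \sqrt{\lambda_{k-1,\ell}}\bigr)^4} = \sqrt{E\bigl(\sqrt{N_{k-1,\ell}} - \sqrt{\lambda_{k-1,\ell}}\bigr)^4} \le 2,$$

so that by (3.5),

$$\int \sqrt{g_{[k_0,k)}^* g_{[k_0,k)}}\, (\mu_{k,2\ell} - \beta_{k,2\ell}^*)^2 \le 4(2p_{k,2\ell} - 1)^2 + 2(\beta_{k,2\ell} - \beta_{k,2\ell}^*)^2.$$

Similarly, $\int \sqrt{g_{[k_0,k)}^* g_{[k_0,k)}}\, m_{k,2\ell} \le \sqrt{EN_{k-1,\ell}^2} \le \lambda_{k-1,\ell} + 1/2$. Thus, by (3.7),

$$\int \sqrt{g_{[k_0,k)}^* g_{[k_0,k)}}\, H^2\bigl(\tilde g_{m_{k,2\ell}, p_{k,2\ell}}, \varphi_{\beta_{k,2\ell}^*}\bigr) \le D_1 [2p_{k,2\ell} - 1]^2 + D\lambda_{k-1,\ell} \bigl[p_{k,2\ell} - \tfrac12\bigr]^4 + (\beta_{k,2\ell} - \beta_{k,2\ell}^*)^2 \tag{3.8}$$

with $D_1 = 3D/8 + 2$. Now, by (2.10) and (1.3),

$$p_{k,2\ell} - \frac12 = \frac{\int_{I_{k,2\ell}} f - \int_{I_{k,2\ell+1}} f}{2 \int_{I_{k-1,\ell}} f} = \frac{\sqrt{2^{k-1}}\, \theta_{k-1,\ell}}{2 \bar f_{k-1,\ell}}, \tag{3.9}$$

so that by (3.5), (2.7), the definition of $\beta_{k,2\ell}^*$ in (3.6) and Lemma 1 (ii),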


\Pk,2£ - Pk,2e\ = 


I nfk-i,e V2 k l 6k- i,i 
2 fc - x fk- 1 / 


- \fin / hcj> k _\f 


(3.10) 


= \[An 


h . = — f h ,( f ) k - 1 £ 

< V4^2 (fe - 1)/2 - 1 / fe "_ 3 0 / (/ - f k -i,e) 2 - 

J Ik-1, t 

Inserting (3.9) and (3.10) into (3.8) and summing over i via (3.6), we find 

J ^ 9[k 0 ,k)9[k 0 ,k)H i,9ki9k) 

2 fe_1 —1 

— E J ^9[k 0 ,k)9[k 0 ,k) ( 9m k ,2t ,Pk, 2 £ ) ^ 01 , 21 ) 


t=0 


2 R-I_1 


(3.11) < E 

1=0 


AD 


2 Ok-1,1 

1 _ q72 


+ HAfc -11 


4 fc lii 


8/jf-M ’64 


+ 


n2* 


2 /fc-M 


( [ (/ ~ fk- 1^) 2 ) 

h-ll / 


2n 


2 r-i_i 


< E 


=0 L £ 0 


411 r.fc-1 


r ^2 fc - 1 02 +(7L + 1 


D \ n2 


2 - - re-i,t ■ \ -j^g 

fc • /n _J 1)2 


>/c—1 


(/ - fk-l,i)‘ 


e 0 \ I Ik -11 

due to = nfk,i/2 k in (2.7) and B\ t < f Ik ^{f ~ fk/) 2 - 

Finally, inserting (3.4) and (3.11) into (3.3) and then using Lemma 2 
yields 

r 2/ w * 


44/( W ffco,fci+i)’ W [fc 0 ,fci+i)) 

o2fco 7~) k\— 2 2 k — 1 

ne ° e 0 fc=fc 0 £=0 


+ 


E 3 

16 


Q\ ^1 22 _1 rp ok / /* 

+ EEf / (/-AA 


k=ko 1=0 £ 0 
$$\le \frac{C 4^{k_0}}{\varepsilon_0 n} + \frac{D_1}{\varepsilon_0^2} \sum_{k=k_0}^{\infty} 2^k \sum_{\ell=0}^{2^k-1} \theta_{k,\ell}^2 + \frac{D_2 n}{\varepsilon_0^3 4^{k_0}} \sum_{k=k_0}^{\infty} 2^{3k} \sum_{\ell=0}^{2^k-1} \theta_{k,\ell}^4$$

with $D_2 = (D/64 + 2/3)/(1 - 1/2)^2 = D/16 + 8/3$ and the theorem follows. □


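4. Approximation of Poisson variables. Let $X_\lambda$ be a Poisson random variable with mean $\lambda$ and $U$ be a uniform variable on $[-1/2, 1/2)$ independent of $X_\lambda$. Define

$$Z_\lambda \equiv 2\,\mathrm{sgn}(X_\lambda + U) \sqrt{|X_\lambda + U|}, \qquad \tilde g_\lambda(y) \equiv \frac{d}{dy} P\{Z_\lambda \le y\}. \tag{4.1}$$

The main result of this section is a local limit theorem which bounds the squared Hellinger distance between this transformed Poisson random variable and a normal random variable.

Theorem 4. Let $Z_\lambda$ and $\tilde g_\lambda$ be as in (4.1). Let $Z_\lambda^* \sim N(2\sqrt{\lambda}, 1)$ and $\varphi_\mu$ be the density of $N(\mu, 1)$. Let $H(\cdot,\cdot)$ be the Hellinger distance. Then, as $\lambda \to \infty$,

$$H^2(Z_\lambda, Z_\lambda^*) = H^2(\tilde g_\lambda, \varphi_{2\sqrt{\lambda}}) = \left(\frac{7}{96} + o(1)\right) \frac{1}{\lambda}. \tag{4.2}$$

Consequently, there exists a universal constant $C < \infty$ such that

$$H^2(\tilde g_\lambda, \varphi_\mu) \le \frac{C}{\lambda} + \frac{(2\sqrt{\lambda} - \mu)^2}{2} \qquad \forall \lambda > 0, \mu. \tag{4.3}$$

Remark 4. The theorem remains valid if $Z_\lambda$ is replaced by

$$Z_\lambda' \equiv 2\sqrt{X_\lambda + U + 1/2},$$

since $H^2(Z_\lambda, Z_\lambda')$ is bounded by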

[■ _ f 00 / \2j+l 1 

2 " 2 J ^\x x +u\h x +u+i /2 - 2 ~ j 1 + V j\(j + 1)! J 

=1 -- E i/? £min ( 1 ’ f)- 

Proof of Theorem 4. The inequality (4.3) follows immediately from (4.2), since $H^2(\varphi_{\mu_1}, \varphi_{\mu_2}) \le (\mu_1 - \mu_2)^2/4$ [cf. Brown, Cai, Low and Zhang (2002), Lemma 3] and $H^2(\tilde g_\lambda, \varphi_\mu) \le 2$.
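The bound on the Hellinger distance between unit-variance normals used in this step can be checked numerically. With the convention $H^2 = \int (\sqrt{f} - \sqrt{g})^2$ used in this paper, the exact value is $2 - 2\exp\{-(\mu_1 - \mu_2)^2/8\}$, which never exceeds $(\mu_1 - \mu_2)^2/4$. The sketch below compares a midpoint-rule integral with this closed form; the function names and grid settings are assumptions of the example.

```python
import math


def normal_pdf(x, mu):
    """Density of N(mu, 1) at x."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)


def hellinger_sq_numeric(mu1, mu2, half_width=12.0, steps=48000):
    """Midpoint-rule value of the integral of (sqrt(f) - sqrt(g))^2."""
    lo = min(mu1, mu2) - half_width
    hi = max(mu1, mu2) + half_width
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        diff = math.sqrt(normal_pdf(x, mu1)) - math.sqrt(normal_pdf(x, mu2))
        total += diff * diff
    return total * h


def hellinger_sq_closed_form(mu1, mu2):
    """2 - 2 exp(-(mu1 - mu2)^2 / 8) for two unit-variance normals."""
    return 2.0 - 2.0 * math.exp(-((mu1 - mu2) ** 2) / 8.0)
```

For small mean differences the closed form and the quadratic bound nearly coincide, so little is lost by using the bound in (4.3).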

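Let $t(x) \equiv 2\,\mathrm{sgn}(x)\sqrt{|x|}$, a strictly increasing function. Define

$$X_\lambda^* \equiv t^{-1}(Z_\lambda^*) = \mathrm{sgn}(Z_\lambda^*)(Z_\lambda^*)^2/4. \tag{4.4}$$

Let $f_\lambda$ and $f_\lambda^*$ denote the densities of $X_\lambda + U$ and $X_\lambda^*$, respectively. Since $t(\cdot)$ is invertible, $H^2(Z_\lambda, Z_\lambda^*) = H^2(X_\lambda + U, X_\lambda^*) = 2 - 2\int \sqrt{f_\lambda f_\lambda^*}$, so that it suffices to show

$$A_\lambda \equiv \int \sqrt{f_\lambda f_\lambda^*} = 1 - \frac{C_\lambda}{\lambda}, \qquad \lim_{\lambda\to\infty} C_\lambda = \frac{7}{192}. \tag{4.5}$$

Since $U$ is uniform, $f_\lambda(x) = e^{-\lambda}\lambda^j/j!$ on $[j - 1/2, j + 1/2)$, so that

$$A_\lambda = \sum_{j=0}^{\infty} f_\lambda(j) \int_{j-1/2}^{j+1/2} \{f_\lambda^*(x)/f_\lambda(j)\}^{1/2}\,dx. \tag{4.6}$$

Since $t'(x) = |x|^{-1/2}$, by (4.4) $f_\lambda^*(x) = |x|^{-1/2} \varphi(t(x) - 2\sqrt{\lambda})$. This gives

$$\frac{f_\lambda^*(x)}{f_\lambda(j)} = \frac{\exp\{-(2\sqrt{x} - 2\sqrt{\lambda})^2/2\}}{\sqrt{2\pi x}\, e^{-\lambda}\lambda^j/j!} = \exp[2\psi_j(x)], \qquad j - \tfrac12 \le x < j + \tfrac12,$$

for $j \ge 1$, in view of the Stirling formula $j! = \sqrt{2\pi}\, j^{j+1/2} \exp(-j + \epsilon_j)$, where

$$\psi_j(x) \equiv -(\sqrt{x} - \sqrt{\lambda})^2 - \frac{\log x}{4} + \frac{\lambda - j}{2} + \frac{j}{2} \log\frac{j}{\lambda} + \frac{\log j}{4} + \frac{\epsilon_j}{2} \tag{4.7}$$

with $1/(12j + 1) < \epsilon_j < 1/(12j)$, for $j = 1, 2, \ldots$. Now, by the mean-value theorem,

$$\int_{j-1/2}^{j+1/2} \left\{\frac{f_\lambda^*(x)}{f_\lambda(j)}\right\}^{1/2} dx = \int_{j-1/2}^{j+1/2} \exp\left[\psi_j(j) + \psi_j'(j)(x - j) + \frac{\psi_j''(x_j)}{2}(x - j)^2\right] dx$$

for some $|x_j - j| \le 1/2$, with

$$\psi_j'(x) = \sqrt{\frac{\lambda}{x}} - 1 - \frac{1}{4x}, \qquad \psi_j''(x) = -\frac{\sqrt{\lambda}}{2x^{3/2}} + \frac{1}{4x^2}. \tag{4.8}$$

Since $\exp[\psi_j(j) + \psi_j''(x_j)(x - j)^2/2]$ is symmetric about $j$, it follows that

$$\int_{j-1/2}^{j+1/2} \left\{\frac{f_\lambda^*(x)}{f_\lambda(j)}\right\}^{1/2} dx = \int_{j-1/2}^{j+1/2} \exp\left[\psi_j(j) + \frac{\psi_j''(x_j)(x - j)^2}{2}\right] \sum_{k=0}^{\infty} \frac{(\psi_j'(j)(x - j))^{2k}}{(2k)!}\,dx. \tag{4.9}$$

Now, we shall take uniform Taylor expansions of $\psi_j$ and their derivatives in

$$J_\lambda \equiv \{j : |j/\lambda - 1| \le \lambda^{-2/5}\}.$$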
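By (4.7), $\psi_j(j) = \lambda\psi(j/\lambda) + \epsilon_j/2$ with

$$\psi(x) \equiv -(\sqrt{x} - 1)^2 + \frac{1 - x}{2} + \frac{x}{2} \log x.$$

Since $\psi(1) = \psi'(1) = \psi''(1) = 0$, $\psi'''(1) = 1/4$ and $\psi''''(1) = -7/8$,

$$\lambda\psi(j/\lambda) = \frac{\lambda (j - \lambda)^3}{4 \cdot 3!\, \lambda^3} - \frac{7\lambda (j - \lambda)^4}{8 \cdot 4!\, \lambda^4}(1 + o(1)) = o(1).$$

Since $1/(12j + 1) < \epsilon_j < 1/(12j)$, $\epsilon_j/2 = (1 + o(1))/(24\lambda) = o(1)$. Thus,

$$\psi_j(j) = \frac{(j - \lambda)^3}{24\lambda^2} - \frac{7}{8} \frac{(j - \lambda)^4}{24\lambda^3}(1 + o(1)) + \frac{1 + o(1)}{24\lambda} = o(1)$$

uniformly in $J_\lambda$ as $\lambda \to \infty$. Similarly, by (4.8) and $|x_j - j| \le 1/2$,

$$\{\psi_j'(j)\}^2 = (1 + o(1)) \frac{(j - \lambda)^2}{4\lambda^2} + \frac{o(1)}{\lambda} = o(1), \qquad \psi_j''(x_j) = \frac{-1 + o(1)}{2\lambda} = o(1).$$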

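These expansions and (4.9) imply that uniformly in $J_\lambda$,

$$\int_{j-1/2}^{j+1/2} \left\{\frac{f_\lambda^*(x)}{f_\lambda(j)}\right\}^{1/2} dx = \int_{j-1/2}^{j+1/2} \left[1 + \psi_j(j) + \frac{\psi_j''(x_j) + (\psi_j'(j))^2}{2}(x - j)^2\right] dx + o(1) \sum_{k=0}^{2} \frac{(j - \lambda)^{2k}}{\lambda^{k+1}}$$

$$= 1 + \frac{(j - \lambda)^3}{24\lambda^2} - \frac{7}{8} \frac{(j - \lambda)^4}{24\lambda^3} + \frac{1}{24\lambda} + \frac{1}{24}\left[\frac{-1}{2\lambda} + \frac{(j - \lambda)^2}{4\lambda^2}\right] + o(1) \sum_{k=0}^{2} \frac{(j - \lambda)^{2k}}{\lambda^{k+1}}$$

as $\int_{j-1/2}^{j+1/2} (x - j)^2\,dx = 1/12$. Since $f_\lambda(j)$ is the Poisson probability mass function of $X_\lambda$,

$$\sum_{j \in J_\lambda} f_\lambda(j) \int_{j-1/2}^{j+1/2} \left\{\frac{f_\lambda^*(x)}{f_\lambda(j)}\right\}^{1/2} dx = 1 + \frac{1}{24\lambda} - \frac{7}{64\lambda} + \frac{1}{24\lambda} - \frac{1}{96\lambda} + \frac{o(1)}{\lambda} = 1 - \frac{7 + o(1)}{192\lambda} \tag{4.10}$$

as $\sum_{j \in J_\lambda} f_\lambda(j) = 1 + o(1/\lambda)$. Note that $E(X_\lambda - \lambda)^3 = \lambda$ and $E(X_\lambda - \lambda)^4 = 3\lambda^2 + \lambda$. Hence, (4.5) follows from (4.6), (4.10) and the fact that

$$\sum_{j \notin J_\lambda} f_\lambda(j) \int_{j-1/2}^{j+1/2} \left\{\frac{f_\lambda^*(x)}{f_\lambda(j)}\right\}^{1/2} dx \le \sqrt{P\{X_\lambda \notin J_\lambda\}\, P\{X_\lambda^* \notin J_\lambda\}} = o\left(\frac{1}{\lambda}\right).$$

□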


5. Approximation of binomial variables. The strong approximation of a normal by a binomial depends on the cumulative distribution function $F_m$ in (2.8). The addition of the independent uniform $U$ in (2.8) to the binomial $X_{m,1/2}$ makes the c.d.f. continuous and thus $\Phi^{-1} \circ F_m$ is a one-to-one function on $(-1/2, m + 1/2)$ that maps symmetric binomials to standard normals.

Let $\varphi_b$ be the $N(b, 1)$ density and $\tilde g_{m,p}$ be the probability density of

$$\Phi^{-1}(F_m[X_{m,p} + U]), \qquad X_{m,p} \sim \mathrm{Bin}(m, p), \tag{5.1}$$

as in (3.1), where $U$ is an independent uniform on $[-1/2, 1/2)$.


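Theorem 5. There is a constant $C_1 > 0$ such that, for all $m \ge 0$,

$$H^2(\tilde g_{m,p}, \varphi_b) = \int \bigl(\sqrt{\tilde g_{m,p}} - \sqrt{\varphi_b}\bigr)^2\,dz \le C_1 \left(\frac{b^2}{m} + \frac{b^8}{m^2}\right), \tag{5.2}$$

where $b = (\sqrt{m}/2) \log(p/(1 - p))$. Consequently,

$$H^2(\tilde g_{m,p}, \varphi_\beta) \le D\left[ \bigl|p - \tfrac12\bigr|^2 + m \bigl|p - \tfrac12\bigr|^4 \right] + \frac{\{\sqrt{m}(2p - 1) - \beta\}^2}{2}. \tag{5.3}$$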


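Proof. The case when $m = 0$ is trivial because $X = 0$ with probability 1 and therefore $\tilde g_{0,p}$ is exactly an $N(0,1)$. Thus, the following assumes that $m \ge 1$.

It follows from (3.1) that

$$\tilde g_{m,p}(z) = p^j (1 - p)^{m-j}\, 2^m\, \varphi_0(z), \tag{5.4}$$

where $j = j(z)$ is the integer between 0 and $m$ such that

$$\Phi^{-1}[F_m(j - \tfrac12)] \le z < \Phi^{-1}[F_m(j + \tfrac12)]. \tag{5.5}$$

Let $\theta = \log(p/q)$ with $q = 1 - p$, so that

$$\log\frac{\tilde g_{m,p}(z)}{\varphi_0(z)} = \left(j - \frac{m}{2}\right)\theta + \frac{m}{2} \log(4pq)$$

and the second term can be approximated by

$$-\frac{\theta^2}{4} - \frac{\theta^4}{24} \le \log(4pq) = -\log\left(\frac{2 + e^{\theta} + e^{-\theta}}{4}\right) \le -\frac{\theta^2}{4} + \frac{\theta^4}{32}. \tag{5.6}$$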
- 4 32 
Let $h_1(\theta) = (2 + e^{\theta} + e^{-\theta})/4$. The second inequality in (5.6) follows from $\log(h_1(\theta)) \ge \log(1 + \theta^2/4) \ge \theta^2/4 - \theta^4/32$. The first inequality in (5.6) follows from $h_1(\theta) \le 1 + \theta^2/4 + \theta^4/24$ for $|\theta| \le 4$, and from $\log(h_1(\theta)) \le |\theta| \le \theta^2/4$ for $|\theta| > 4$. Now, let

$$z' = z'(z) \equiv \frac{j(z) - m/2}{\sqrt{m}/2} \quad \text{and} \quad b = \theta \frac{\sqrt{m}}{2}. \tag{5.7}$$

Then for some $-1/24 \le h_2(\theta) \le 1/32$ the log ratio is

$$\log\frac{\tilde g_{m,p}(z)}{\varphi_0(z)} = z'b - \frac{b^2}{2} + h_2(\theta)\, m\theta^4.$$

The log ratio of normals with different means is $\log(\varphi_0/\varphi_b) = -zb + b^2/2$. Therefore the ratio with respect to the normal with mean $b$ is

$$\log\frac{\tilde g_{m,p}}{\varphi_b} = h_2(\theta)\, m\theta^4 - b(z - z'), \qquad |h_2(\theta)| \le \frac{1}{24}. \tag{5.8}$$
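The two-sided bound (5.6) on $\log(4pq)$, on which the expansion of the log ratio rests, can be spot-checked on a grid. The sketch below is only a numerical sanity check; the function names, the tolerance, and the grid are assumptions of this example.

```python
import math


def log_four_pq(theta):
    """log(4pq) when p/q = exp(theta), i.e. -log((2 + e^theta + e^-theta)/4)."""
    return -math.log((2.0 + math.exp(theta) + math.exp(-theta)) / 4.0)


def bounds_hold(theta, tol=1e-12):
    """Check -theta^2/4 - theta^4/24 <= log(4pq) <= -theta^2/4 + theta^4/32."""
    lower = -theta ** 2 / 4.0 - theta ** 4 / 24.0
    upper = -theta ** 2 / 4.0 + theta ** 4 / 32.0
    value = log_four_pq(theta)
    return lower - tol <= value <= upper + tol


grid = [i / 4.0 for i in range(-40, 41)]  # theta from -10 to 10 in steps of 1/4
all_ok = all(bounds_hold(theta) for theta in grid)
```

Near $\theta = 0$ both bounds are tight to fourth order, which is what produces the $m\theta^4$ remainder term in (5.8).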


Since y\og(x/y) < x — y < a: log (x/y), for all positive x and y, 
lzVg m ,p log <x^Pb~ \/ 9m, P < 

so that by (5.8), 

((jp,mi <Pb) — ^ j | 1 °§ ^ 


(5.9) 


< 


m9 

~2A 


4 \ 2 


fb 

9m,p 

b 2 


(}Pb T 9m,p) dz 


b r 

+ — (z- z') 2 (<Pb + 9m, p) dz. 


It follows from Carter and Pollard (2004) that the difference between z 
and z! = z'(z) is bounded by 

_ ~/| < / C 2 [m~ x l 2 +m~ 1 |~j 3 ), for all z, 

if \z\ < \/2m, 


\C 2 (m l / 2 + m 1 |z /|3 '' 


(5.10) 

for some constant C 2 . Thus, 

(5.11) l (z z ) 9m.p dz ' 2C 2 ~ f |- f ~9m,p dz — f 2 9m,p dz \ . 

J \mjm z Jz 2 >2mm z J 

Since $\int g_{m,p}\,1\{z' = (j - m/2)/(\sqrt{m}/2)\}\,dz = P\{X_{m,p} = j\}$,

$$\int |z'|^6 g_{m,p}\,dz = E\Big|\frac{X_{m,p} - m/2}{\sqrt{m}/2}\Big|^6 = O\big(1 + m^3|p - \tfrac12|^6\big) = O(1 + b^6)$$

uniformly in $(m, p)$. It follows from (5.4) that

$$\int_{z^2 > 2m} z^6 g_{m,p}\,dz \le 2^m \int_{z^2 > 2m} z^6 \varphi_0\,dz = O(2^m m^6 e^{-m}) = O(1).$$
The above two inequalities and (5.11) imply

$$\int (z - z')^2 g_{m,p}\,dz \le 2C_2^2\,O(1/m + b^6/m^2).$$

Similarly, $\int (z - z')^2 \varphi_b\,dz \le 2C_2^2\,O(1/m + b^6/m^2)$. Inserting these two
inequalities into (5.9) yields (5.2) in view of (5.7).

Now let us prove (5.3). The Hellinger distance is bounded by 2, so that
$b^8/m^2$ in (5.2) can be replaced by $b^4/m$ and it suffices to consider
$|p - \frac12| \le \frac14$ for the proof of (5.3). By inspecting the infinite series expansion of
$\log(p/q) = \log(1 + x) - \log(1 - x)$ for $x = 2p - 1$, we find that for $|p - \frac12| \le \frac14$,
$|\log(p/q)| \le \frac83|2p - 1|$ and $|\log(p/q) - 4(p - \frac12)| \le \frac89|2p - 1|^3$. These inequalities,
respectively, imply

$$\frac{b^2}{m} + \frac{b^4}{m} \le \frac{16}{9}(2p - 1)^2 + \frac{256}{81}\,m(2p - 1)^4$$

and $|b - \sqrt{m}(2p - 1)|^2 \le \frac{16}{81}m|2p - 1|^6 \le \frac{4}{81}m|2p - 1|^4$, in view of the definition
of $b$, which then imply (5.3) via (5.2) and the fact that $H^2(\varphi_b, \varphi_\beta) \le (b - \beta)^2/4$.

□
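The last step of the proof uses a bound on the Hellinger distance between two unit-variance normals. The sketch below compares a numerical integral with the closed form $2(1 - e^{-(b-\beta)^2/8})$ and with the quadratic bound $(b - \beta)^2/4$. It assumes the normalization $H^2(f, g) = \int(\sqrt{f} - \sqrt{g})^2$, which is consistent with the statement that the distance is bounded by 2; the function names, grid size and integration range are choices made for this illustration.

```python
import math

def normal_pdf(z, mean):
    """Density of N(mean, 1) at z."""
    return math.exp(-0.5 * (z - mean) ** 2) / math.sqrt(2 * math.pi)

def hellinger_sq_numeric(b, beta, half_width=12.0, steps=48_000):
    """Midpoint-rule approximation of the integral of (sqrt(phi_b) - sqrt(phi_beta))^2."""
    lo = min(b, beta) - half_width
    hi = max(b, beta) + half_width
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        z = lo + (i + 0.5) * h
        diff = math.sqrt(normal_pdf(z, b)) - math.sqrt(normal_pdf(z, beta))
        total += diff * diff
    return total * h

def hellinger_sq_closed_form(b, beta):
    """Closed form 2 * (1 - exp(-(b - beta)^2 / 8)) for unit-variance normals."""
    return 2.0 * (1.0 - math.exp(-((b - beta) ** 2) / 8.0))

cases = [(0.0, 0.5), (1.0, 3.0), (-2.0, 2.0)]
results = [
    (hellinger_sq_numeric(b, beta), hellinger_sq_closed_form(b, beta), (b - beta) ** 2 / 4)
    for b, beta in cases
]
for numeric, closed, quadratic in results:
    print(round(numeric, 6), round(closed, 6), quadratic)
```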


APPENDIX 


A.1. The Tusnády inequality. The coupling of symmetric binomials and
normals maps the integers $j$ onto intervals $[\beta_j, \beta_{j+1}]$ such that the normal$(m/2, m/4)$
probability in the interval is equal to the binomial probability at $j$, $\binom{m}{j}2^{-m}$.

Taking the standardized values

$$z_j = \frac{2(\beta_j - m/2)}{\sqrt{m}}, \qquad u_j = \frac{2(j - 1/2 - m/2)}{\sqrt{m}},$$

Carter and Pollard (2004) showed that for $m/2 < j < m$ and certain universal
finite constants $C_\pm$


C _ 


Uj + 1 


m 


< Zj - Uj 



log(l — u 2 Jm ) 


2 cuj 


<C 4 


■ logm 


m 


where $c = \sqrt{2\log 2}$ and $\gamma$ is an increasing function with $\gamma(0) = 1/12$ and
$\gamma(1) = \log 2 - 1/2$.

This immediately implies that 

(A.l) \z« — uA < — (|u 7 | 3 + logm) V^-<- 

m m2 

for a certain universal constant $C_0 < \infty$. We shall prove (5.10) here based
on (A.1). Because of the symmetry in both distributions, it is only necessary
to consider $z > 0$.
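To make the standardized cut points concrete, the following sketch computes $z_j$ and $u_j$ for the symmetric binomial and reports how close they are in the central range. It uses the description above: the normal$(m/2, m/4)$ mass of $[\beta_j, \beta_{j+1}]$ equals $\binom{m}{j}2^{-m}$, so that $z_j = \Phi^{-1}(P\{X \le j - 1\})$ for $X \sim$ Bin$(m, 1/2)$. The function name, the choice $m = 100$ and the cutoff $|u_j| \le 2$ are assumptions made for the illustration; extreme $j$ are skipped because the binomial tail probabilities round to 0 or 1 in floating point.

```python
import math
from statistics import NormalDist

def standardized_cut_points(m, max_abs_u=2.0):
    """Return (j, z_j, u_j) for the quantile coupling of Bin(m, 1/2) and N(m/2, m/4).

    z_j is the standard normal quantile of P(X <= j - 1); u_j = 2(j - 1/2 - m/2)/sqrt(m).
    Only indices with |u_j| <= max_abs_u are returned.
    """
    std_normal = NormalDist()
    rows = []
    cumulative = 0  # exact count of outcomes with fewer than j successes
    for j in range(m + 1):
        if j > 0:
            u_j = 2.0 * (j - 0.5 - m / 2.0) / math.sqrt(m)
            if abs(u_j) <= max_abs_u:
                z_j = std_normal.inv_cdf(cumulative / 2 ** m)
                rows.append((j, z_j, u_j))
        cumulative += math.comb(m, j)
    return rows

rows = standardized_cut_points(100)
max_gap = max(abs(z_j - u_j) for _, z_j, u_j in rows)
print(len(rows), max_gap)
```

The exact integer arithmetic for the binomial distribution function avoids cancellation; only the final division and the normal quantile are done in floating point.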
It follows from (5.5) and (5.7) that

$$z_j \le z < z_{j+1} \iff u_j < z' = z'(z) < u_{j+1}.$$

Let $z_j \le z < z_{j+1}$. Since $u_{j+1} - u_j = 2/\sqrt{m}$, for $u_{j+1} \le \sqrt{m}/2$ (A.1) implies


(A.2) |* - z '| < & - «il v |^+i - « j+ il + /=+ c;(-L + N ) ■ 

Since $u_j$ and $z_j$ are both increasing in $j$, it follows that $(z \wedge z')/\sqrt{m}$ are
uniformly bounded away from zero for $u_{j+1} > \sqrt{m}/2$, so that


(A.3) |z — z'\ < |Zj — Uj\ V \zj + i — Uj. |_i| + 


m 


<Co- 


\zr A \z‘ 


m 


/13 


for $(m + 1)/\sqrt{m} = u_{m+1} \ge u_{j+1} > \sqrt{m}/2$ and $z \le \sqrt{2m}$. Since $z \vee z' \le z \vee
u_{m+1} \le \sqrt{2}\,z$ for $z > \sqrt{2m}$, (A.2) and (A.3) imply

$$|z - z'| \le \begin{cases} C_2\,(m^{-1/2} + m^{-1}|z|^3), & \text{for all } z,\\ C_2\,(m^{-1/2} + m^{-1}|z'|^3), & \text{if } |z| \le \sqrt{2m}, \end{cases}$$

for a certain universal $C_2 < \infty$, that is, (5.10).

A.2. Technical lemmas. The following three lemmas simplify the rest of 
the proof of Theorem 3. 

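Lemma 1. (i) Let $f_{k,\ell}$ and $h_{k,\ell}$ be as in (2.7) and (2.3). Then

(A.4)  $$0 \le \sqrt{f_{k,\ell}} - h_{k,\ell} \le 2^{k-1} f_{k,\ell}^{-3/2} \int_{I_{k,\ell}} (f - f_{k,\ell})^2.$$

(ii) Let $\theta_{k,\ell}$ be the Haar coefficients of $f$ as in (1.3). Then

(A.5)  $$\Big|\int \sqrt{f}\,\phi_{k,\ell} - \frac{\theta_{k,\ell}}{2\sqrt{f_{k,\ell}}}\Big| \le 2^{k/2-1} f_{k,\ell}^{-3/2} \int_{I_{k,\ell}} (f - f_{k,\ell})^2.$$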


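Proof. Let $T = (f - f_{k,\ell})/f_{k,\ell} \ge -1$. By algebra,

$$\sqrt{1 + T} - 1 = \frac{T}{1 + \sqrt{1 + T}} = \frac{T}{2} - \frac{T^2}{2(1 + \sqrt{1 + T})^2}.$$

It follows from (2.3) and (2.7) that

$$h_{k,\ell} = 2^k \int_{I_{k,\ell}} \sqrt{f} = 2^k \sqrt{f_{k,\ell}} \int_{I_{k,\ell}} \Big(1 + \frac{f - f_{k,\ell}}{2f_{k,\ell}} - \frac{(f - f_{k,\ell})^2}{2f_{k,\ell}^2(1 + \sqrt{1 + T})^2}\Big),$$

which implies (A.4) as $2^k \int_{I_{k,\ell}} 1 = 1$ and by (2.7) $\int_{I_{k,\ell}} (f - f_{k,\ell}) = 0$. For (ii) we
have

$$\int \sqrt{f}\,\phi_{k,\ell} = \sqrt{f_{k,\ell}} \int \phi_{k,\ell}\Big(1 + \frac{f - f_{k,\ell}}{2f_{k,\ell}} - \frac{(f - f_{k,\ell})^2}{2f_{k,\ell}^2(1 + \sqrt{1 + T})^2}\Big),$$

which implies (A.5) as $\int \phi_{k,\ell} = 0$ and $|\phi_{k,\ell}| \le 2^{k/2}$ by (1.4). □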
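Inequality (A.4) can be illustrated on a step function, for which all integrals reduce to finite sums. The sketch below is only an illustration under stated assumptions: it takes $I_{k,\ell} = [\ell 2^{-k}, (\ell+1)2^{-k})$, $f_{k,\ell} = 2^k\int_{I_{k,\ell}} f$ and $h_{k,\ell} = 2^k\int_{I_{k,\ell}} \sqrt{f}$ as the reading of (2.7) and (2.3), which are not reproduced in this section, and the function name and test density are hypothetical.

```python
import math
import random

def check_A4(values, k):
    """Evaluate both sides of (A.4) for a step function on [0, 1].

    values: positive heights of f on len(values) equal cells (a power of two,
            at least 2**k). Returns a list of (gap, bound) pairs, one per block,
            where gap = sqrt(f_kl) - h_kl and
            bound = 2**(k-1) * f_kl**(-3/2) * integral over the block of (f - f_kl)^2.
    """
    cells_per_block = len(values) // 2 ** k
    cell_width = 1.0 / len(values)
    out = []
    for ell in range(2 ** k):
        block = values[ell * cells_per_block:(ell + 1) * cells_per_block]
        f_avg = sum(block) / len(block)                         # assumed f_{k,l}
        h_avg = sum(math.sqrt(v) for v in block) / len(block)   # assumed h_{k,l}
        integral_sq = sum((v - f_avg) ** 2 for v in block) * cell_width
        gap = math.sqrt(f_avg) - h_avg
        bound = 2 ** (k - 1) * f_avg ** -1.5 * integral_sq
        out.append((gap, bound))
    return out

rng = random.Random(1)
heights = [rng.uniform(0.2, 3.0) for _ in range(2 ** 8)]
pairs = check_A4(heights, k=3)
print(all(-1e-12 <= gap <= bound + 1e-12 for gap, bound in pairs))
```

The lower bound is Jensen's inequality for the square root; the upper bound follows from the algebraic identity for $\sqrt{1 + T} - 1$ because $(1 + \sqrt{1 + T})^2 \ge 1$.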


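Lemma 2. Let $\theta_{k,\ell}$ be the Haar coefficients in (1.3) and $f_{k,\ell}$ be as in
(2.7). Then

$$\sum_{k=k_0}^{\infty} 2^k \sum_{\ell=0}^{2^k-1} \Big(\int_{I_{k,\ell}} (f - f_{k,\ell})^2\Big)^2 \le \frac{2^{-ck_0}}{(1 - 1/2^c)^2} \sum_{k=k_0}^{\infty} 2^{k(1+c)} \sum_{\ell=0}^{2^k-1} \theta_{k,\ell}^4 \qquad \forall c > 0.$$

Proof. Define

$$\delta_{i,j,k,\ell} = \begin{cases} 1, & \text{if } I_{i,j} \subset I_{k,\ell},\\ 0, & \text{otherwise.} \end{cases}$$

Since $\sum_{j=0}^{2^i-1} \delta_{i,j,k,\ell} = 2^{i-k}$ for $i \ge k$, using Cauchy-Schwarz twice yields

$$\Big(\int_{I_{k,\ell}} (f - f_{k,\ell})^2\Big)^2 = \Big(\sum_{i=k}^{\infty} \sum_{j=0}^{2^i-1} \delta_{i,j,k,\ell}\,\theta_{i,j}^2\Big)^2$$

$$\le \Big[\sum_{i=k}^{\infty} 2^{-ic/2} \Big(2^{ic}\,2^{i-k} \sum_{j=0}^{2^i-1} \delta_{i,j,k,\ell}\,\theta_{i,j}^4\Big)^{1/2}\Big]^2$$

$$\le \sum_{i=k}^{\infty} 2^{-ic} \sum_{i=k}^{\infty} 2^{ic}\,2^{i-k} \sum_{j=0}^{2^i-1} \delta_{i,j,k,\ell}\,\theta_{i,j}^4$$

$$= \frac{2^{-k(1+c)}}{1 - 1/2^c} \sum_{i=k}^{\infty} 2^{i(1+c)} \sum_{j=0}^{2^i-1} \delta_{i,j,k,\ell}\,\theta_{i,j}^4.$$

Since $\sum_{\ell=0}^{2^k-1} \delta_{i,j,k,\ell} = 1$ for $i \ge k$, the above inequality implies

$$\sum_{k=k_0}^{\infty} 2^k \sum_{\ell=0}^{2^k-1} \Big(\int_{I_{k,\ell}} (f - f_{k,\ell})^2\Big)^2 \le \sum_{k=k_0}^{\infty} \frac{2^{-ck}}{1 - 1/2^c} \sum_{i=k}^{\infty} 2^{i(1+c)} \sum_{j=0}^{2^i-1} \sum_{\ell=0}^{2^k-1} \delta_{i,j,k,\ell}\,\theta_{i,j}^4$$

$$= \sum_{i=k_0}^{\infty} \Big(\sum_{k=k_0}^{i} \frac{2^{-ck}}{1 - 1/2^c}\Big) 2^{i(1+c)} \sum_{j=0}^{2^i-1} \theta_{i,j}^4$$

$$\le \frac{2^{-ck_0}}{(1 - 1/2^c)^2} \sum_{i=k_0}^{\infty} 2^{i(1+c)} \sum_{j=0}^{2^i-1} \theta_{i,j}^4.$$

□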


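Lemma 3. Let $X_\lambda$ be a Poisson random variable with mean $\lambda$. Then

$$E\big(\sqrt{X_\lambda} - \sqrt{\lambda}\big)^4 \le 4.$$

Proof. Since $E(X_\lambda - \lambda)^4 = \lambda(3\lambda + 1)$,

$$E\big(\sqrt{X_\lambda} - \sqrt{\lambda}\big)^4 \le \frac{E(X_\lambda - \lambda)^4}{(\sqrt{\lambda} + 1)^4} + \lambda^2 P(X_\lambda = 0) \le \frac{3\lambda + 1}{\lambda + 6} + 1 \le 4.$$

□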
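Both facts used in Lemma 3 can be confirmed numerically by summing against the Poisson probability mass function. This is a minimal sketch: the helper name, the truncation point of the series and the grid of means are assumptions chosen for the illustration, and truncating the nonnegative series can only lower the computed values.

```python
import math

def poisson_expectation(lam, func, tail_sd=12.0):
    """Approximate E[func(X)] for X ~ Poisson(lam), lam > 0, by a truncated sum."""
    upper = int(lam + tail_sd * math.sqrt(lam) + 30)
    total = 0.0
    for k in range(upper + 1):
        # log of the Poisson pmf, computed via lgamma to avoid overflow
        log_pmf = -lam + k * math.log(lam) - math.lgamma(k + 1)
        total += math.exp(log_pmf) * func(k)
    return total

def sqrt_fourth_moment(lam):
    """E(sqrt(X) - sqrt(lam))^4, the quantity bounded by 4 in Lemma 3."""
    return poisson_expectation(lam, lambda k: (math.sqrt(k) - math.sqrt(lam)) ** 4)

def fourth_central_moment(lam):
    """E(X - lam)^4, which should equal lam * (3 * lam + 1)."""
    return poisson_expectation(lam, lambda k: (k - lam) ** 4)

means = [0.01, 0.1, 0.5, 1.0, 2.0, 5.0, 10.0, 100.0, 1000.0]
moments = {lam: sqrt_fourth_moment(lam) for lam in means}
for lam in means:
    print(lam, moments[lam], fourth_central_moment(lam), lam * (3 * lam + 1))
```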


A.3. Proof of Theorem 1. First note that

$$H(T_n R_n^1 V_n^*, Z_n^*) \le H(T_n R_n^1 V_n^*, T_n(N, X_N)) + H(T_n(N, X_N), Z_n^*)$$

and

$$H(V_n^*, R_n^2 T_n^{-1} Z_n^*) \le H(V_n^*, R_n^2(N, X_N)) + H(R_n^2(N, X_N), R_n^2 T_n^{-1} Z_n^*).$$

Note also that since for any randomization $T$ and random $X$ and $Y$,
$H(TX, TY) \le H(X, Y)$, it follows that

$$H(T_n R_n^1 V_n^*, T_n(N, X_N)) \le H(R_n^1 V_n^*, (N, X_N))$$

and

$$H(R_n^2(N, X_N), R_n^2 T_n^{-1} Z_n^*) \le H((N, X_N), T_n^{-1} Z_n^*) = H(T_n(N, X_N), Z_n^*).$$

For the class $\mathcal{H}$ and the randomizations $R_n^1$ and $R_n^2$ it follows from (2.15),
(2.16) and the proof of Proposition 3 on page 508 of Le Cam (1986) that

$$\sup_{f \in \mathcal{H}} H(R_n^1 V_n^*, (N, X_N)) \to 0$$

and

$$\sup_{f \in \mathcal{H}} H(V_n^*, R_n^2(N, X_N)) \to 0.$$

Hence (1.9) and (1.8) will follow once

(A.6)  $$\sup_{f \in \mathcal{H}} H(T_n(N, X_N), Z_n^*) \to 0$$

is established.

By Theorem 3, for (A.6) to hold it is sufficient to show that



If the class of functions $\mathcal{H}$ is a compact set in the Besov spaces, then the
partial sums converge uniformly to 0,

$$\sup_{f \in \mathcal{H}} \|f - f_k\|_{1/2,p,p} \to 0$$


for p = 2 or 4 as k —> oo. This implies that there is a sequence 7 ^ —s > 0 such 


that 7 fc x sup feH ||/ - f k 11 1 / 2 , 4 , 4 —+ 0. To be specific, let 

7fc = sup ||/ — /fc||i /2j4j4 - 


/eW 


It is necessary to choose the sequence of integers $k_0(n)$ that will be the
critical dimension that divides the two techniques. Let $k_0$ be the smallest
integer such that ^ > 7 k 0 . Therefore, fco(n) —> 00 , and as n —► 00 , 





□ 


0 . 


A.4. Proof of Theorem 2. Theorem 2 follows from Theorem 1 and the
fact that the Lipschitz and Sobolev spaces described are compact in the
Besov spaces.

The Lipschitz class is equivalent to $B_{\beta,\infty,\infty}$ and therefore is compact in
$B_{1/2,p,p}$ if $\beta > 1/2$. The Sobolev class is equivalent to $B_{\alpha,2,2}$ and



n 


where $C_\alpha$ depends only on $\alpha$. Thus if $\mathcal{F}$ is compact in Sobolev($\alpha$) for $\alpha \ge 1/2$
then it is compact in $B_{1/2,2,2}$.

Further restrictions are required to show that the Sobolev($\alpha$) class is
Therefore, for $\mathcal{F}$ bounded in Lipschitz($\beta$), a compact Sobolev($\alpha$) set is also
compact in $B_{1/2,4,4}$ if $\alpha \ge 1 - \beta$.

Finally, if $\mathcal{F}$ is compact in Sobolev($\alpha$), $\alpha \ge 3/4$, then it immediately
follows from the Sobolev embedding theorem that the function is bounded in
Lipschitz(1/4) [e.g., Folland (1984), pages 270 and 273], and it follows that
$\mathcal{F}$ is compact in $B_{1/2,4,4}$. □